Gengsheng Li

Visual-Advantage On-Policy Distillation for Vision-Language Models

May 21, 2026
Via arXiv

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Apr 02, 2026
Via arXiv

R-Diverse: Mitigating Diversity Illusion in Self-Play LLM Training

Feb 16, 2026
Via arXiv